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Abstract 

In multivariate time series, the estimation of the covariance matrix of the observation 
innovations plays an important role in forecasting as it enables the computation of the 
standardized forecast error vectors as well as it enables the computation of confidence 
bounds of the forecasts. We develop an on-line, non-iterative Bayesian algorithm for 
estimation and forecasting. It is empirically found that, for a range of simulated time 
series, the proposed covariance estimator has good performance converging to the true 
values of the unknown observation covariance matrix. Over a simulated time series, the 
new method approximates the correct estimates, produced by a non-sequential Monte 
Carlo simulation procedure, which is used here as the gold standard. The special, but 
' important, vector autoregressive (VAR) and time- varying VAR models are illustrated by 

considering London metal exchange data consisting of spot prices of aluminium, copper, 
lead and zinc. 

. Some key words: Multivariate time series, dynamic linear model, Kalman filter, vector 

C^> autoregressive model, London metal exchange. 
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Introduction 



Multivariate time series receive considerable attention because a great deal of time series 
data arrive in vector form. Whittle (1984) and Liitkepohl (1993) discuss VARMA models 
for vector responses, whilst Harvey (1989, Chapter 8), West and Harrison (1997, Chapter 
16) and Durbin and Koopman (2001, Chapter 3) extend this work to state space models 
for observation vectors. In econometrics most studies of state space models focus on trend 
estimation, signal extraction and volatility. A review of recent developments of state space 
models in econometrics can be found in Pollock (2003). Barassi et al. (2005) and Gravelle and 
Morley (2005) give applications of the Kalman filter to interest rates data and Harvey et al. 
(1994) use Kalman filter techniques to estimate the volatility of foreign exchange rates using 
multivariate stochastic volatility (MSV) models. With the exception of multivariate GARCH 
and MSV models, which focus on the prediction of the volatility, it is usually desirable to 
use a structural state space model to forecast time series vectors (e.g. foreign exchange 
rates, monthly sales, interest rates, etc) and to estimate the observation innovation covariance 
matrix of the underlying time series. For such applications and for short term forecasting the 
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above covariance matrix can be assumed time-invariant, but unknown, and its estimation is 
the main aim of this paper. 

The estimation of the observation covariance matrix plays an important role in forecasting. 
Firstly we note that, under the general multivariate dynamic linear model (see equation (fTJ) 
below), the multi-step forecast mean of the response time series vector is a non-linear function 
of the observation covariance matrix (West and Harrison, 1997, Chapter 16). Secondly, the 
computation of the standardized forecast error vectors requires a precise estimation of the 
observation covariance matrix and thus a miss-specification of the observation covariance 
matrix can lead to false results regarding the evaluation and judgement of the model. Thirdly, 
the multi-step forecast covariance matrix is a linear function of the observation covariance 
matrix and the former is of particular interest; the forecast covariance matrix can explain the 
variability of the forecasts and hence it can enable the computation of confidence bounds for 
the forecasts. Finally, the precise estimation of the observation covariance matrix gives an 
accurate estimation of the cross-correlation structure of the several component time series, 
which is particularly useful, especially for financial time series. For all the above reasons the 
study of the estimation of the observation covariance matrix is worthwhile and its contribution 
to forecasting for multivariate time series is paramount. 

The problem of the estimation of the observation innovation variance for univariate state 
space models has been well reported (West and Harrison, 1997, §4.5; Durbin and Koopman, 
2001, §2.10), however, for vector time series this problem becomes considerably more com- 
plex and the available methodology consists of special cases, approximations and iterative 
procedures. 

Let yt be a p-dimensional observation vector following the Gaussian dynamic linear model 
(DLM): 

y t = F'6 t + e t and 6 t = G0 t - 1 +w u (1) 

where 0t is a (f-dimensional Markovian state vector, F is a known d x p design matrix and 
G a known d x d transition matrix. The notation F' is used for the transpose matrix of 
F. The distributions usually adopted for {e^}, {uJt} and 0o are the multivariate Gaussian, 
i.e. et ~ A/" p (0,E), oj t ~ A/d(0, O) and 6q ~ A/d(mo, Po), for some known priors tuq and Po- 
The innovation vectors {et} and {tot} are assumed individually and mutually uncorrelated 
and they are also assumed uncorrelated with the initial state vector 9q, i.e. for all t ^ s: 
E(e t e' s ) = 0, E(u t u;' s ) = 0, and for all t,s > 0: E(e t w' a ) = 0, E(e t 0' ) = and E(u) t 0' ) = 0, 
where E(-) denotes expectation. The covariance matrices E and f2 are typically unknown 
and their estimation or specification is a well known problem. The interest is centered on 
the estimation of E, while O can be specified a priori (West and Harrison, 1997, Chapter 6; 
Durbin and Koopman, 2001, §3.2.2). 

Several methods have been proposed, for the estimation of E. Harvey (1986) and Quin- 
tana and West (1987) independently introduce matrix-variate DLMs, which are matrix-variate 
linear state space models allowing for covariance estimation. Harvey (1986) proposes a like- 
lihood estimator, while Quintana and West (1987) propose a Bayesian estimation modelling 
E with an inverted Wishart distribution. Harvey (1986) 's model is reported and further de- 
veloped in Harvey (1989), Fernandez and Harvey (1990), Harvey and Koopman (1997) and 
Moauro and Savio (2005), while Quintana and West (1987) 's model is reported and further 
developed in Quintana and West (1988), Queen and Smith (1992), West and Harrison (1997), 
Salvador et al. (2003), Salvador and Gargalo (2004) and Salvador et al. (2004). However, 
both suggestions (Harvey (1989)'s and Quintana and West (1987)'s) are criticized in Barbosa 



2 



and Harrison (1992) where it is shown that the above models are restrictive in the sense that 
one can decompose the response vector yt into several scalar time series and model each of 
these time series individually, using univariate DLMs. Barbosa and Harrison (1992) propose 
an approximate algorithm for the general DLM ([1]), but their main assumption seems rather 
unjustified, since it suggests that for any p x p matrix C it is X 1//2 C£ -1 / 2 = S 1 / 2 CS~ 1 / 2 , 
where X is a point estimate of £ and the notation £ 1//2 stands for the symmetric square root 
of £ (Gupta and Nagar, 1999, p. 7). This assumption holds clearly when S 1 / 2 , C commute 
and when E 1 / 2 = £*, C commute, where (£*) 2 is any particular realization of £. However, 
in general the above assumption is difficult to check since £ is the unknown covariance ma- 
trix subject to estimation. In addition, that assumption seems to be probabilistically quite 
inappropriate, since it translates that the non-stochastic quantity £ 1//2 C£ _1//2 equals the 
stochastic quantity S 1 / 2 CE _1//2 with probability 1. A possible analysis can be obtained in 
special cases where £ is diagonal or when the off-diagonal elements of £ are all common. Tri- 
antafyllopoulos and Pikoulas (2002) and Triantafyllopoulos (2006) adopt the model of Harvey 
(1986) and they provide an improved on-line estimator for £ based on a standard maximum 
likelihood technique. The problem is again that the models discussed lack the general for- 
mulation of the state space model ([1]); e.g. one can easily show that all above models are 
special cases of model (pQ). Iterative procedures via maximum likelihood and Markov chain 
Monte Carlo (MCMC) techniques are available, but they tend to be slow, especially as the 
dimension of the observation vector p increases. Kitagawa and Gersch (1996), Shumway and 
Stoffer (2000, Chapter 4), Durbin and Koopman (2001, Chapter 7) and Doucet et al. (2001) 
discuss univariate modelling with iterative methods, but their efficiency in multivariate time 
series is not yet explored. Barbosa and Harrison (1992) and West and Harrison (1997, §16.2.3) 
discuss the problem of inefficiency of iterative methods and they point out that the number 
of parameters to be estimated in S is p(p+ l)/2, which rapidly increases with the dimension 
p of the response vector, e.g. for p = 10 there are 55 distinct parameters in £ to be estimated. 
In addition to this Dickey et al. (1986) discuss relevant issues on specifying and assessing the 
prior distribution of £ pointing out difficulties in the implementation of iterative procedures. 

In this paper we propose a new non-iterative Bayesian procedure for estimating £ and for 
forecasting yt- This procedure offers a novel estimator of £ for the general DLM ([I]). The 
proposed estimator is empirically found to converge to the true value of £ and this estimator 
approximates well the respective estimators in the special cases of the conjugate univariate 
and matrix-variate DLMs. A comparison with a non-sequential Monte Carlo simulation shows 
that the new method produces estimates close to the MCMC. The focus and the benefit 
employing the new method is on on-line estimation and therefore no attempt has been made 
to compare the proposed algorithms with sequential iterative procedures. The reason for this 
is justified by the above discussion and the interested reader should refer to Dickey et al. 
(1986) and West and Harrison (1997, §16.2.3). The proposed forecasting procedure for model 
(H|) is applied to the important model subclasses of vector autoregressive (VAR) and VAR 
with time-dependent parameters. These models are illustrated by considering London metal 
exchange data, consisting of spot prices of aluminium, copper, lead and zinc (Watkins and 
McAleer, 2004). 

We begin by developing the main idea of the paper and giving the proposed algorithm. The 
performance of this algorithm is illustrated in the following section by considering simulated 
time series data; a comparison with a Monte Carlo simulation is performed. The proceeding 
section gives an application to vector autoregressive modelling, which is used to analyze 
London metal exchange data, in the following section. The appendix details a proof of a 
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theorem in the paper and it describes the MCMC simulation procedure. 

Main Results 

Denote with y = (y±,y2, ■ ■ ■ ,yt) the information set comprising data up to time t, for some 
positive integer t > 0. Let mt and Pt be the posterior mean and covariance matrix of 9t\y 
and St be the posterior expectation of E, i.e. E(E|y*) = St- Let yt(l) = E(yt+i|?/) = F'Gmt 
be the one-step forecast mean at time t and Qt+i = Var(y i+ i|?/) = F'R t +iF + St be the 
one-step forecast covariance matrix at t, where Rt+i = GPfG' + f2. Upon observing yt+i, we 
define the one-step forecast error vector as et+i = yt+i — VtiX)- The next result (proved in 
the appendix) gives an approximate property of St- 

Theorem 1. Consider the dynamic linear model Let E be the covariance matrix of 

the observation innovation et and assume that lim^oo St = E, where E(E|y ) = St is the 
true posterior mean of E given y l . Let uq be a positive scalar and So = E(E) be the prior 
expectation o/E. If E is bounded, then for large t the following holds approximately 

St = ^ (noSo + t S&QT^QT^SlS) , (2) 

where ei, Qi are defined above and Sj^, Q { denote respectively the symmetric square roots 
of the matrices Si—i, Qj l based on the spectral decomposition factorization of symmetric 
positive definite matrices (i = 1, 2, . . . , t). 

Conditionally now on E = S, for a particular value S, we can apply the Kalman filter 
to the DLM dU) and obtain the posterior and predictive distributions of #t|E = S, y and 
yi+h|E = S,y t , for a positive integer h > 0, known as the forecast horizon. Theorem Q] 
motivates approximating the true posterior mean St by S = St, which is produced from 
application of equation (J2J), given a particular data set y = (yi, yi, . . . , yt). Thus we obtain 
the following algorithm: 

Algorithm 1. (a) Prior distribution at time t = 0: #o|E = So ~ Nd{rho, Pq), for some mo, 
Po and So- 

(b) Posterior distribution at time t: 0f |E = St, y t ~ Md{rht, Pt), where et = yt — yi-i(l) and 

fht = Gfht-i + A t et, P t = GPt-iG' + Q — A t Q t A' t , A t = (GP t -iG' + tyFQ^ 1 , 

% = ^T t Lso + t Sm i/2 ^Q; 1/2 S^ - 

(c) h-step forecast distribution at t: y^/jE = St,y l ~ J\f p {yt{h),Qt{h)}, where yt{h) = 
F'G h m t and 

h-l 

Q t (h) = F'G h Pt(G h )'F + F'G i VL(G i )' F + S t . 

i=0 
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In the special case of matrix-variate DLMs (Harvey, 1986; West and Harrison, 1997, §16.4) 
the estimator St approximates the true posterior mean of E produced by an application of 
Bayes' theorem, assuming a prior inverted Wishart distribution for £. To see this, note 
that in the matrix-variate DLM (this model is briefly in page 1151 see equation ©), F is a 
<i-dimensional design vector and Q t = U t S t -i with U t = F'R t F + 1 and so equation ([2D can 
be written recursively as 

St = n^ l {nt-iSt-i + e t e' t /Ut) and n t = n t _i + 1 = n + t. (3) 

It is easy to verify that the assumption lim^oo St = E is satisfied, since lim^oo St = 
lim^oo E(E|y ) and lim^oo Var{vech(£)|y*} = 0, where vech(-) denotes the column stacking 
operator of a lower portion of a symmetric matrix. For p = 1 the matrix-variate DLM is 
reduced to the conjugate Gaussian/gamma DLM (West and Harrison, 1997, §4.5). It turns 
out that the estimator St of equation ([2]) approximates the analogous estimators of all existing 
conjugate Gaussian dynamic linear models. 

It is worth noting that Theorem [1] and Algorithm Q] have been presented for the state 
space model ([T]) having time-invariant components F, G and £1. However, these results 
apply if some or all of the above components change with time. In addition, if the evolution 
covariance matrix is time-dependent, it can be specified via discount factors (West and 
Harrison, 1997, Chapter 6). This is a useful consideration, because in practice the signal Qt 
is unlikely to have the same variability over time. 

For the application of Algorithm Q] the initial values mo, Po, no and So must be specified, 
mo can be specified from historical information from the underlying experiment and Po can 
be set as a typically large diagonal matrix, e.g. Po = lOOOip, reflecting a low precision (or 
high uncertainty) on the specification of the moments of 9q- The scalar no can be set to 
no = 1 (in the special case of matrix-variate DLMs, hq is the prior degrees of freedom). 
So is a prior estimate of E and requires at least a rough specification. As information is 
deflated in time series, a miss-specification of So may not affect much the posterior estimate 
St, especially in the presence of large data sets. However, in many cases and especially in 
financial time series, a miss-specification of So can lead to poor estimates of E. Here we 
suggest that a diagonal covariance matrix can be used, where the diagonal elements of So 
reflect the empirical expectation of the diagonal elements of E. This expectation can be 
obtained by studying historical data and other qualitative pieces of information, which are 
usually available to practicing experts of the experiment or of the application of interest. 

Simulation Studies 

Empirical Convergence of St 

We have generated 1000 bivariate time series {to}*^!^' ' '1000 f rom several state space models 
and then we have averaged the 1000 estimates S^t (produced by each of the 1000 time series) 

and compared the average S t = 1000" 1 J2l = i Si t t with the true value of E. 

Since in practice complicated models are decomposed into simple models comprising local 
level, polynomial trend and seasonal components (Godolphin and Triantafyllopoulos, 2006), 
we consider estimation separately in such different component models. We have three mod- 
elling situations of interest: situation 1 (bivariate local level models); situation 2 (bivariate 
linear trend models); and situation 3 (bivariate seasonal models). For each of the above 
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three situations we have generated 1000 bivariate time series, each of length 500, using three 
different covariance matrices E, i.e. 

1 7 
7 50 / 

Throughout the simulations we have chosen high correlations for each Ej (i = 1,2,3), since 
uncorrelated or approximately uncorrelated state space models can be handled easily by 
employing several univariate state space models. The priors of Ej are chosen as S^j = I2, 

S&j = 150/ 2 and 5f } = diag{3, 40} (t = 1, 2, . . . , 1000). The diagonal choice for the priors 

S^q has been done for: (a) operational simplicity (the user is likely to expect rough values 
for the diagonal elements of Ej, rather than for the associated correlations) and (b) judging 
how the estimation of Ej is affected by improper priors in the sense of setting the off-diagonal 
elements of S^q to zero, while the true values of Ej posses high correlations. Throughout 
the models the remaining settings are uq = 1, $7 = I2, = [0 0]' and Po = 1000/2) f° r an 
models. Table [TJ shows the results. There are three blocks of columns, each showing results of 
the state space model considered, namely local level model (LL or block 1), linear trend model 
(LT or block 2) and seasonal model (SE or block 3). In each block the first column shows the 

mean of the average 5 = 500 -1 Y2t=i &t of all St- The second column shows the average S^oo 
at time point t = 100. Likewise the third column shows the respective £500 averaged over all 
1000 series. The rows in Table [1] show the picture of St over the three different values of E, 
e.g. Ei, E2 and E3. The average estimate of the correlations is also shown and it is marked 
in the table by p. The results suggest that, generally, the LL model has the best performance 
as opposed to the LT and the SE model, although we note that 012 = 3 (covariance in Ei) is 
estimated better from the LT model. It appears that the estimator St for all models converges 
to the true values of E, but the rate of convergence depends on the underlying state space 
model (here LL performs faster convergence) and on the prior Sq. 

Table [2] shows the averaged (over all 1000 simulated time series) mean vector of squared 
standardized one-step forecast errors (MSSEW), for each of the three models (LL, LT, SE) 
and for each of E (Ei,E2,E3). For comparison purposes, Table [2] also shows the respective 
values of the MSSE^ when Ej is the true value. The target value of the MSSE^ is [1 1]. We 
see that the MSSE^ approaches the respective MSSE*- 2 ^ and this demonstrates the accuracy 
of the estimator St- We observe that under E3, the MSSE^ 1 ) has values significantly smaller 
than 1 as compared to the MSSE*- 2 ^ using the true value of E3. 

Comparison of the Local Level Model with MCMC 

We have simulated a single local level model under the observation covariance matrix E — E^ 
and the relevant model components of the local level model of the previous sub-section. We 
apply Algorithm [TJ and we compare it with a state of the art MCMC estimation procedure 
based on a blocked Gibbs sampler suitable for state space models (Gamerman, 1997, p. 
149); the MCMC procedure we use is described in the appendix. The MCMC estimation 
procedure is an iterative non-sequential MCMC procedure and its role in this section is to 
provide a means of comparison with the non-iterative procedure of Algorithm [TJ MCMC is 
the gold standard, since it produces (given enough computation) exact computation of St- But 
MCMC is impractical; the new proposed method is a quick, practical and easily implemented 
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Table 1: Performance of the estimator St (of Algorithm [T]) for 1000 simulated bivariate 
dynamic models generated from a local level model (LL), a linear trend model (LT) and a 
seasonal model (SE) under three observation covariance matrices Si, £2 an d S3. 



Model 


LL 






LT 






SE 






£ = £; 


I 


Sioo 


S500 




Sioo 


5*500 


f 


Sioo 


5*500 


a u = 2 


1.945 


1.938 


1.997 


2.572 


2.721 


2.392 


2.171 


2.207 


2.165 


ai2 = 3 


2.798 


2.770 


2.920 


2.988 


3.029 


3.026 


2.314 


2.186 


2.589 


022 = 5 


4.722 


4.685 


4.899 


4.547 


4.489 


4.777 


4.399 


4.283 


4.694 


p = 0.948 


0.923 


0.919 


0.933 


0.874 


0.867 


0.895 


0.748 


0.711 


0.812 


0-11 = 100 


100.039 


99.931 


100.303 


98.271 


98.157 


98.368 


98.731 


98.806 


99.602 


CT12 = 85 


83.133 


83.028 


84.757 


79.471 


78.969 


82.660 


79.627 


79.427 


83.254 


CT22 = 80 


80.430 


80.277 


80.353 


78.917 


78.865 


79.822 


79.755 


79.886 


79.896 


p = 0.950 


0.927 


0.927 


0.944 


0.902 


0.897 


0.933 


0.897 


0.894 


0.933 


an = 1 


1.124 


1.135 


1.101 


1.200 


1.234 


1.126 


1.184 


1.202 


1.151 


0-12 = 7 


6.506 


6.457 


6.735 


5.388 


5.177 


5.904 


5.764 


5.623 


6.234 


022 = 50 


49.305 


49.375 


49.840 


48.518 


48.579 


49.392 


48.816 


49.038 


49.540 


p = 0.989 


0.784 


0.862 


0.909 


0.706 


0.668 


0.791 


0.758 


0.732 


0.825 



Table 2: Mean vector of squared standardized one-step forecast errors (MSSEW) of the mul- 
tivariate dynamic model of Algorithm [H The index i = 1, 2 refers to when £ is estimated by 
the data (i = 1) and when £ is assumed known (according to the simulations) for comparison 
purposes (i = 2). The notation LL, LT, SE and £1, £2, £3 is the same as in Tabled) 





mssew 


MSSE( 2 ) 


LL 


(Si) 


0.994 


1.071 


0.999 


0.995 


LL 


(£2) 


0.939 


0.914 


0.999 


0.999 


LL 


(S3) 


0.773 


1.026 


0.998 


0.997 


LT 


(Si) 


0.875 


1.141 


0.992 


0.996 


LT 


(S2) 


0.900 


0.895 


1.002 


0.997 


LT 


(S3) 


0.774 


1.031 


1.000 


0.996 


SE 


(Si) 


0.930 


1.092 


0.997 


0.999 


SE 


(S 2 ) 


0.903 


0.864 


0.998 


0.996 


SE 


(S3) 


0.805 


1.026 


0.998 


0.996 
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Table 3: Bivariate simulated local level dynamic linear model. Showed are: real versus esti- 
mated values of S = {0y}i,3=i,2 an d the correlation coefficient p. The first column indicates 
how many observations were used in each estimation. 



N 


Real 


MCMC 


New 


Real 


012 

MCMC 


New 


Real 


022 

MCMC 


New 


Real 


P 

MCMC 


New 


100 


2.00 


1.68 


1.24 


3.00 


2.35 


1.82 


5.00 


4.12 


3.69 


0.95 


0.90 


0.85 


150 


2.00 


1.78 


1.34 


3.00 


2.44 


1.91 


5.00 


3.97 


3.72 


0.95 


0.92 


0.86 


200 


2.00 


1.76 


1.46 


3.00 


2.48 


2.04 


5.00 


4.02 


3.77 


0.95 


0.94 


0.87 


250 


2.00 


2.04 


1.64 


3.00 


2.89 


2.33 


5.00 


4.50 


4.17 


0.95 


0.95 


0.89 


300 


2.00 


1.92 


1.65 


3.00 


2.78 


2.39 


5.00 


4.44 


4.31 


0.95 


0.95 


0.90 


350 


2.00 


1.90 


1.69 


3.00 


2.76 


2.46 


5.00 


4.46 


4.40 


0.95 


0.95 


0.90 


400 


2.00 


2.01 


1.76 


3.00 


2.90 


2.56 


5.00 


4.59 


4.50 


0.95 


0.96 


0.91 


450 


2.00 


2.05 


1.82 


3.00 


2.97 


2.66 


5.00 


4.71 


4.65 


0.95 


0.96 


0.92 


500 


2.00 


2.11 


1.85 


3.00 


3.09 


2.74 


5.00 


4.90 


4.79 


0.95 


0.96 


0.92 



approximation. In this section we compare the new method with the gold standard in order 
to show how good is the approximation. Tables [3] and H] give the results; the former shows 
the estimates of S with both methods (MCMC and Algorithm [1]) and the latter shows the 
performance of the one-step forecast errors for both methods. In Table H] the one-step forecast 
error vector et = [e\t e-2t]' and the mean vector of squared one-step forecast errors are shown 
for several values of t under both estimation methods. We observe that the new method 
(of Algorithm [T]) approximates well the MCMC estimates, especially for large values of time 
t = N. 

We note that MCMC should not be considered as a better method as compared to the 
proposal of Algorithm[Tl since MCMC is an iterative and in particular in this paper it is a non- 
sequential estimation procedure. The application of sequential MCMC estimation (Doucet et. 
al, 2001) often experience several challenges as for example time-constraints, availability for 
general purpose algorithms, prior-specification, prior-sensitivity, fast monitoring and expert 
intervention features. The proposal of this paper provides a strong modelling approach allow- 
ing for variance estimation in a wide class of conditionally Gaussian dynamic linear models 
and this section shows that for large time periods its performance is close to Monte Carlo 
estimation. 

Application to VAR and TVVAR Time Series Models 

The dynamic model (H|) is very general and an important subclass of ([TJ is the popular vector 
ARMA model. In recent years vector autoregressive (VAR) models have been extensively 
developed and used, especially for economic time series, as in Doan et al. (1984), Litterman 
(1986), Kadiyala and Karlsson (1993, 1997), Ooms (1994), Johansen (1995), Uhlig (1997), Ni 
and Sun (2003), Sun and Ni (2004) and Huerta and Prado (2006). 

Our discussion in this section includes two important subclasses of model (Q]), which can 
be used for a wide-class of stationary and non-stationary time series forecasting. The first is 
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Table 4: Bivariate simulated local level dynamic linear model. Showed are: one-step forecast 
errors at time t = N and the squared sums of the forecasting errors up to time N. 





eiTV 


e2N 






N 


MCMC 


New 


MCMC 


New 


MCMC 


New 


MCMC 


New 


100 


-2.14 


-2.22 


-2.46 


-2.49 


4.85 


4.84 


8.54 


8.59 


150 


-0.14 


-0.28 


-3.88 


-3.92 


4.83 


4.86 


8.21 


8.26 


200 


-1.02 


-1.09 


-0.12 


0.05 


5.02 


5.04 


8.10 


8.13 


250 


-0.24 


-0.20 


-1.45 


-1.47 


5.25 


5.30 


8.72 


8.76 


300 


-1.72 


-1.79 


-0.61 


-0.71 


5.01 


5.12 


8.91 


8.95 


350 


-0.42 


-0.46 


-1.91 


-1.91 


5.12 


5.12 


9.01 


9.04 


400 


-0.42 


-0.54 


-2.26 


-2.29 


5.21 


5.22 


9.09 


9.12 


450 


-3.91 


-4.02 


-5.42 


-5.41 


5.28 


5.29 


9.32 


9.33 


500 


1.48 


1.68 


-0.93 


-0.98 


5.24 


5.24 


9.54 


9.54 



the VAR model of known order I > 1, defined by 

y t = + $ 2 ?/t-2 + ■ • • + &m-t + et, e t ~A^(0,S), (4) 

where <l>i, <3?2 5 • • • > $e ar e px p matrices of parameters. In the usual estimation of VAR, 
stationarity has to be assumed and so the roots of the polynomial (in z) 

\I P - $\z - ®2Z 2 $ez e \ = 

should lie outside the unit circle. In standard theory @ may not assume a Gaussian distri- 
bution for et, although in practice this is used for operational simplicity. It is also known that 
for a high order £ model @ approximates multivariate moving average models, which are 
typically difficult to estimate and this makes the VAR even more attractive in applications. 
It is also known that for general on-line estimation and forecasting, the covariance matrix 
X either has to be assumed known or it has to be diagonal. This is a major limitation, 
because it means that either the modeller knows a priori the cross-correlation between the 
series {yu}, {y2t}, ■■■ , {y P t}, where y t = [yu y2t • • • U P t]' , or that the p scalar time series 
are all stochastically uncorrelated, in which case it is more sensible to use several univariate 
AR models instead. Recently, the need for estimation of £ as a full covariance matrix (e.g. 
where S has p(p + l)/2 elements to be estimated) is considered, but the existing estimation 
procedures include necessarily iterative estimation via importance sampling (Kadiyala and 
Karlsson, 1997). Ni and Sun (2003) point out that from a frequentist standpoint ordinary 
least squares and maximum likelihood estimators of (J1J) are unavailable. These authors state 
that asymptotic theory estimators may not be applicable for VAR (especially when {yt} is a 
short-length time series). Ni and Sun (2003), Sun and Ni (2004) and Huerta and Prado (2006) 
propose Bayesian estimation of the autoregressive parameters <3?i and S, based on MCMC. It 
follows that for model ([1]) when E is unknown, only iterative estimation procedures can be 
applied. Our proposal for on-line estimation of S gives a step forward to the estimation and 
forecasting of VAR models and it is outlined below. 

We propose a generalization of the univariate state space representation considered in West 
and Harrison (1997, §9.4.6). Other state space representations of the VAR are considered in 
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Huerta and Prado (2006), but these representations, usually referred to as canonical represen- 
tations of the VAR model (Shumway and Stoffer, 2000) are not convenient for the estimation 
of E, because £ is embedded into the evolution equation of the states t . First note that we 
can rewrite flU) as y t = <$>X t + e t , where $ = [$i $ 2 • • ■ ®e] and X t = y' t _ 2 ■ ■ ■ y' t _ e ]' 

and so we can write 

y t = F[B + e t = {X[ ® J p )vec($) + e t , (5) 

where vec(-) denotes the column stacking operator of a portion of a matrix and <8> denotes 
the Kronecker or tensor product of two matrices. Model © can be seen as a regression-type 
time series model and it can be handled by the general Algorithm [1] for model ([I]) if we set 
G = I p , = and if we replace F by the time- varying Ft = Xt<2)I p . Thus we can readily 
apply Algorithm [1] to estimate £ and 6 or $i, <3?2, ■ ■ ■ , $e- 

Moving to the time- varying vector autoregressive (TVVAR) time series, in recent years 
there has been a growing literature for TVVAR time series. Kitagawa and Gersch (1996), 
Dahlhaus (1997), Francq and Gautier (2004) and Anderson and Meerschaert (2005) study 
parameter estimation based on the asymptotic behaviour of TVVAR and time- varying ARMA 
models. From a state space standpoint West et al. (1999) propose a state space formulation 
for a univariate time-varying AR model applied to electroencephalographic data. In this 
section we extend this state space formulation to a vector of observations and hence we can 
propose the application of Algorithm [T] in order to estimate the covariance matrix of the error 
drifts of the TVVAR model. 

Consider that the p- vector time series {yt} follows the TVVAR model of known order £ 
defined by 

y t = $uVt-i + $2tyt-2 + ■ ■ ■ + $etVt-i + et, e t ~ jV p (0,S), (6) 

where <3?it, $2t; • • • , &et are the time- varying autoregressive parameter matrices. The model 
can be stationary, locally-stationary or non-stationary depending on the roots of the t poly- 
nomials (in z) 

\I p - $i t z - $2tz 2 §uA = 0. 

Typical considerations include the local stationarity where there are several regimes for which, 
locally, {yt} is stationary, but globally {yt} is non-stationary. Also the time-dependent param- 
eter matrices &u can allow for an improved dynamic fit as opposed to the static parameters 
of the VAR. 

In our development we adopt a random walk for the evolution of the parameters $u 
(i = 1,2, . . . ,£), although the modeller might suggest other Markovian stochastic evolution 
formulae for &n . The random walk evolution is the natural consideration when {yt} is assumed 
locally stationary. Hence we can rewrite model © in state-space form as 

y t = <f> t X t + e t = F[B t + e t and t = B t -\ + u u (7) 

where X t = [y'^ y' t _ 2 ■■■ y' t _ e }' , F t = X t ® I p , $ t = [$ w $ 2 t ■■■ &t = vec($ t ) and 

oJt ~ AC,2^(0, f2), for some transition covariance matrix Model ((7|) is reduced to ([5]) when 
Vi = 0, in which case Ot = Ot-i = &■ After specifying $7, we can directly apply Algorithm Q] to 
the state space model (|7|) and thus we can obtain an algorithm for the estimation of S, for 
the estimation of 9t or <J>it, $2t, • • • , &£t and for forecasting the series {yt}- 
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Aluminium spot price (US$ per tonne) 





Lead spot price (US$ per tonne) 




50 100 150 

Time 




Figure 1: LME data, consisting of aluminium, copper, lead and zinc spot prices (in US dollars 
per tonne of each metal). 



11 



London Metal Exchange Data 



In this section we analyze London metal exchange (LME) data consisting of official spot prices 
(US dollars per tonne of metal). LME is the world's leading non-ferrous metals' market, 
trading currently highly liquid contracts for metals, such as aluminium, aluminium alloy, 
copper, lead, nickel, tin and zinc. According to the LME website ( |http : //www . lme . co . uk"7| ) 
"LME is highly successful with a turnover in excess of US$3,000 billion per annum. It also 
contributes to the UKs invisible earnings to the sum of more than £250 million in overseas 
earnings each year." More information about the functions of the LME can be found via its 
website (see above) ; the recently growing literature on the econometrics modelling of the LME 
can be found in the review of Watkins and McAleer (2004). 

We consider forecasting for four metals exchanged in the LME, namely aluminium, copper, 
lead and zinc. The data are provided from the LME website for the period of 4 January 2005 
to 31 October 2005. After excluding weekends and bank holidays there are N = 210 trading 
days. We store the data into the 4x1 vector time series {yt}t=i,2,...,2io and yt = [yu 2/2* Vzt Hit]', 
where yu denotes the spot price at time t of aluminium, y2t denotes the spot price at time t 
of copper, y^t denotes the spot price at time t of lead and yu denotes the spot price at time 
t of zinc. The data are plotted in Figure [TJ 

We propose the VAR and TVVAR models of the previous section; the motivation of this 
being that from Figure [1] the evolution of the data seems to follow roughly an autoregressive 
type model. Indeed there is an apparent trend with no seasonality, which can be modelled 
with a trend model or with a VAR or TVVAR model of the previous section. Here we 
illustrate the proposal of VAR and TVVAR models, which, according to the previous section, 
can estimate the covariance matrix of j/t, given the state parameters, and thus the correlation 
structure of {yt} can be studied. Other models for this kind of data have been applied in 
Triantafyllopoulos (2006) and we can envisage that the models of West and Quintana (1987) 
can also be applied to the LME data. 

First we apply the algorithms of the previous section to several VAR and TVVAR models 
of different orders in order to find out which model gives the best performance. Performance 
here is measured via the mean vector of squared standardized one-step forecast errors (MSSE) 
and the mean vector of absolute percentage one-step forecast errors (MAPE). The first is 
chosen as a general performance measure taking into account the estimation of the covariance 
matrix £ and the second is chosen as a generally reliable percent performance measure. Table 
Oshows the results of 10 VAR(i) and TVVAR(i) models (first column) of order i = 1, 2, . . . , 10. 
The discount factor 5 refers to the discounting of the evolution covariance matrix of the state 
parameters 6t; 5 = 1 refers to a static 9t = (VAR model), while 5 < 1 refers to a dynamic 
local level evolution of 6t = #t_i + oJt (TVVAR model). Table [5] shows that the performance 
of the TVVAR is remarkable compared with the performance of VAR, which produces very 
high MSSE throughout the range of i. Out of the VAR models, the best is the VAR(l), which 
still produces very large MSSE. This indicates that a moving average (MA) model is unlikely 
to produce good results at all, as the MSSE of the VAR increases with the order i. Also the 
approximation of a MA model with a high order VAR model will include a large number of 
state parameters to be estimated and this will introduce computational problems. 

Therefore, our attention is focused on the TVVAR models. From a computational stand- 
point we note that as the order increases <5 can not be too low, because then there are 
computational difficulties in the calculation of the symmetric square root of Qt, used for the 
estimation of St (the estimate of £). Lower values of 5 work better (Triantafyllopoulos, 2006) 



12 



Table 5: Mean vector of squared standardized one-step forecast errors (MSSE) and mean 
vector of absolute percentage one-step forecast errors (MAPE) of the multivariate LME time 
series {yt}- The first column indicates several VAR and TVVAR models. 





MSSE 


MAPE 


VAR(l) 




6.614 


16.782 


7.655 


18.370 





033 





071 





143 





057 


TVVAR(l): 5 = 


0.1 


2.430 


1.764 


0.622 


1.852 





059 





053 





084 





076 


VAR(2) 




19.610 


15.966 


11.934 


10.271 





081 





226 





201 





101 


TVVAR(2): 5 = 


0.35 


1.296 


1.743 


1.228 


1.822 





065 





057 





116 





095 


VAR(3) 




11.777 


23.715 


9.906 


9.058 





585 





345 





480 





246 


TVVAR(3): 6 = 


0.65 


2.254 


3.149 


2.222 


2.180 





074 





053 





132 





108 


VAR(4) 




39.979 


54.892 


28.407 


19.169 





159 





103 





235 





161 


TVVAR(4): 5 = 


0.6 


1.389 


1.802 


1.210 


1.329 





101 





072 





179 





147 


VAR(5) 




18.592 


16.605 


15.474 


12.570 





203 





076 





392 





248 


TVVAR(5): 5 = 


0.7 


1.429 


2.269 


1.651 


1.677 





114 





079 





208 





171 


VAR(6) 




24.910 


19.085 


14.584 


17.784 





206 





134 





320 





197 


TVVAR(6): 5 = 


0.75 


1.828 


2.705 


1.683 


1.757 





132 





089 





243 





197 


VAR(7) 




21.722 


38.054 


14.597 


14.180 





330 





092 





422 





490 


TVVAR(7): 5 = 


0.75 


1.366 


2.044 


1.191 


1.531 





148 





101 





280 





223 


VAR(8) 




28.985 


35.867 


11.291 


16.370 





515 





325 





812 





563 


TVVAR(8): 5 = 


0.8 


2.130 


2.900 


1.637 


1.910 





168 





111 





326 





249 


VAR(9) 




40.229 


53.798 


12.249 


19.691 





393 





184 





416 





411 


TVVAR(9): 5 = 


0.95 


14.042 


21.011 


6.724 


8.708 





207 





124 





352 





284 


VAR(IO) 




46.791 


49.869 


16.240 


23.974 





611 





306 





751 





694 


TVVAR(IO): 5 -- 


= 0.9 


4.273 


7.541 


3.637 


5.629 





205 





124 





391 





296 
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and here we have chosen the lowest values of 5, which are allowed. Our decision on the best 
TVVAR model is based on the following four criteria. 

1. low order models are preferable as they have fewer state parameters; 

2. 5 should not be too low, because then the covariance matrix of 9t will be too large; 

3. the MSSE vector should be close to [1 1 1 1]'; 

4. the MAPE vector should be as low as possible. 

Considering the above criteria we favor the TVVAR(2). Figure [2] shows the estimate of 
the observation covariance matrix S. From the right graph we observe that the estimate of 
the correlations of yu and yjt, given 9t are very high (close to 1) and this means that in 
forecasting; this provides useful information about the cross-dependence of the four metal 
prices over time. 




Figure 2: Estimate of the observation covariance matrix S = {0y}i,j=i, 2,3,4- The left graph 
shows the estimates of the variances an; the solid line shows the estimate of o"n, the dashed 
line shows the estimate of 022, the dotted line shows the estimate of 033, the dashed/dotted 
line shows the estimate of 044. The right graph shows the estimate of the correlations of 
yit\&t with yjt\0t (j = 2,3,4); the solid line shows the estimate of the correlation of yu\9t 
with 2/2* I @t j the dashed line shows the estimate of the correlation of yu\9t with yst\9t and the 
dotted line shows the estimate of the correlation of yu\9t with ya\9t- 

As mentioned before two competitive models to our TVVAR modelling for the LME data 
are the matrix- variate DLMs (MV-DLMs) of Quintana and West (1987) and the discount 
weighted regression (DWR) of Triantafyllopoulos (2006). Next we compare the TVVAR(2) 
model discussed above with these two modelling approaches. We start by briefly describing 
the MV-DLM and the DWR. 
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Table 6: Mean vector of squared standardized one-step forecast errors (MSSE) and mean 
vector of absolute percentage one-step forecast errors (MAPE) for the LME data and for 
three multivariate models: TVVAR, MV-DLM and DWR. 





MSSE 


MAPE 


TVVAR(2) 
MV-DLM 
DWR 


1.296 1.743 1.228 1.822 
1.306 2.436 0.984 1.887 
2.202 1.610 1.590 1.868 


0.065 0.057 0.116 0.095 
0.019 0.022 0.026 0.025 
0.013 0.015 0.017 0.017 



The MV-DLM is defined by 



Vi 



F'O t + e' t and @ t = GQ t -l + w t , 



(8) 



where F is a d x 1 design vector, Gf is a d x p state matrix, G is a d x d transition matrix, 
~ Ap(0, £) and vec(u;t)|£, fi ~ Afd p (0, £ <8) fi), where vec(-) denotes the column stacking 
operator of a lower portion of a matrix and (g> denotes the Kronecker product of two matrices. 
A prior inverted Wishart distribution is assumed for £ and the resulting posterior distributions 
as well as further details on the model can be found in Quintana and West (1987) and West 
and Harrison (1997, Chapter 16) (for more references on this model, see also the Introduction). 
In the application of MV-DLMs it is necessary to specify F and G. Following Quintana and 
West (1987), who consider international exchange rates data, and by consulting the plots of 
Figure Q] we propose a linear trend model for the LME data. Thus we can set 



1 




and G 



1 1 
1 



The DWR is defined by 



y t = y t -i + tpt + e t and ip t = A-i + Ct, 
with et|£ ~ Ap(0, £) and Q ~ Ap(0, fit). This model can be put into state space form as in 

F[Q t + e t , t 



yt = [y 



t-i 



i 



1 




i 




" o " 


. it . 






+ 





The covariance matrix fit is modelled with a discount factor 5 and £ is estimated following 
Triantafyllopoulos and Pikoulas (2002) and Triantafyllopoulos (2006). 

Table [6] shows the MSSE and the MAPE of the three models. We see that all models 
produce reasonable results. For the MSSE the best model is the TVVAR(2) (with the excep- 
tion of the lead variable where the MV-DLM produces MSSE closer to 1). For the MAPE 
the best model is the DWR with the TVVAR(2) producing the highest MAPE. Out of the 
three models, the MV-DLM is limited by its mathematical form, which is constructed to give 
conjugate analysis (see also the Introduction). The DWR suffers from similar limitations as 
the MV-DLM, but it provides good results, for linear trend time series without seasonality. 
The TVVAR model provides a good modelling alternative and considering the numerous ap- 
plications of VAR time series models in econometrics, it is believed that the TVVAR has a 
great potential. 
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In conclusion, the TVVAR model can produce forecasts with good forecast accuracy, 
while the correlation of the series can be estimated on-line with a fast linear algorithm. A 
criticism of the model is that its efficiency depends on its order and if high order TVVAR 
models are required (e.g. as in approximating moving average processes with time-dependent 
parameters) its efficiency will be similar of that of a vector MA, since the discount factor will 
have to be close to 1. It will be interesting to know how the order of the TVVAR model is 
related to the boundness of the eigenvalues of the covariance estimator St- 

Concluding Comments 

This paper develops an algorithm for covariance estimation in multivariate conditionally Gaus- 
sian dynamic linear models, assuming that the observation covariance matrix is fixed, but 
unknown. This is a general estimation procedure, which can be applied to any Gaussian 
linear state space model. The algorithm is empirically found to have good performance pro- 
viding a covariance estimator which converges to the true value of the observation covariance 
matrix. The proposed methodology compares well with a non-sequential state of the art 
MCMC estimation procedure and it is found that the proposed estimates are close to the 
estimates of the MCMC. The new algorithm is applied (but not limited to) model subclasses 
of VAR and VAR with time-dependent parameters (TVVAR), which have great application 
in financial time series. Considering the London metal exchange data, it is found that the 
TVVAR model has outstanding performance as opposed to the VAR model. It is believed 
that the development of the TVVAR model is a worthwhile project and the proposed fast, 
on-line algorithm for the estimation of the observation covariance matrix, is a step forward 
opening several paths for practical forecasting. 

The focus in this paper is on facilitating and advancing non-iterative covariance estimation 
procedures for vector time series. Such procedures are particularly appealing, because of their 
simplicity and ease in use. For such wide class of models such us the conditionally Gaussian 
dynamic linear models, the proposed on-line algorithm enables the computation of the mean 
vector of standardized errors as well as it enables the computation of the multi-step forecast 
covariance matrix. Both these computations are valuable considerations in forecasting and 
they attract interest by academics and practitioners alike. 
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Appendix 

Proof of Theorem Q] 

Let vech(-) denote the column stacking operator of a lower portion of a symmetric square 
matrix and let ® denote the Kronecker product of two matrices. First we prove that for large 
t, it is approximately 

E(S - Atete'tA^y 1 ) = E(S - Ate^M" 1 ), (A-l) 
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where At = n t ' S\^ 2 l Q t 1 ^ 2 . Conditional on E, we have from an application of the Kalman 
filter that Cov(eitejt, e^te^E, y t ~ 1 ) is bounded, where et = [e\t e 2 t ••• e p t]'. Since E is 
bounded, St is also bounded (lim^oo St = E), and so all the covariances of e^e^ and e^e^ 
unconditional on E are also bounded. This means that all the elements of Var{vech(ete£)} 
are bounded and so Var{vech(ete£)} is bounded. Now let 



X 1 =E(V-A t e t 44.\y t> y 



t-i> 



S t - A t e t e' t At 



and 



X 2 = E(E - A t e t e' t At\y t - 1 ) = S t -i - A t Q t A' t . 

Then, since lim^oo St = E, there exists appropriately a large integer t{L) > such that for 
every t > t(L) it is E(Xi - X 2 \y t ~ l ) » 0. Also 



Var{vech(^ 1 - ^a)!^ 1 } = Vax{(A ® A^v^M)!?/" 1 } = -> 0, 



"7 



with 



S' t 1 ZiQ7 1/2 ® ^iQt" 1/2 J D P [Var{vech(e i 4)|y t - 1 }] J D; Q^X^i Q^^l 



-l/2 e l/2 



-1/2 e l/2 



where D p is the duplication matrix and from the first part of the proof we have that Et is 
bounded. It follows that for any t > t{L) it is X\ ~ X 2 with probability 1 and so we have 
proved equation (|A-lj) . Using E(E|y ) = St, from equation (|A-lj) we have 

E(E|y*) - Ate t e' t A' t = E(E|y*" 1 ) - AM^tlv^A 



1 



St = St-i + -SlL\Q; ll2 {e t e't - Q t )Q t 1/Z S^ 
1 „ 1 



-1/2 c l/2 



St — S- 



— S t -i + —SllXQt 1/2 ete'tQ~ t 
nt n t 



1/2^1/2 



n t S t = nt-iSt-i + S^Qt e t e t Q t S^ = n S + ^ S i L x Q i e i e i Q i 



-1/2 c l/2 



t-1 



1/2^-1/2, 



-l/2 c l/2 



i=l 



and by dividing by n t = uq + 1 = n t -\ + 1 we obtain equation (J2]) as required. 



The Gibbs Sampler for Multivariate Conditionally Gaussian DLMs 

The following procedure applies to any conditionally Gaussian dynamic linear model in 
the form of equation (pQ). For the simulation studies considered in this paper, given data 
V = (yii 2/2) • • • 1 2Mr)) we are interested in sampling a set of state vectors, #i,#2 • • • >#iv 
and the observation covariance matrix E from the full, multivariate posterior distribution 

of e 1 ,e 2 ,...,e N ,j:\y N . 

Gibbs sampling involves iterative sampling from the full conditional posterior of each 
6t, \9—t) E, y , for all t = 1, 2, . . . , N, and S|#i, 9 2 , . . . , On, y N ; in our notation, 0-t means that 
we are conditioning upon all the components Q±, 9 2 , ■ ■ ■ , On but B%. Given the conditionally 
normal and linear structure of the system, such full conditional distributions are standard, 
and therefore easily sampled. However, such an implementation of the Gibbs sampler, where 
each component is updated once at a time, could be very inefficient when applied to the 
multivariate DLMs discussed in this paper; in fact, the high-correlation of the dynamic system 
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will most likely bring convergence problems. In order to overcome such difficulties, following 
the early suggestions of Carter and Kohn (1994) and Friihwirth-Schnatter (1994), we have 
chosen to implement a blocked Gibbs sampler Gamerman (1997, p. 149); within this context, 
this sampling scheme is better known as the forward filtering, backward sampling algorithm. 
Following is a concise description of the algorithm used in our studies; for more details, the 
reader should consult the references above, as well as West and Harrison (1997, Chapter 15). 

The first step of the Gibbs sampler involves sampling from the updating distribution of 
6n\Y,, y N , which is given by the multivariate normal Nd{m^j , Pjf)- This is done in the forward 
filtering phase of the sampler, as follows. Starting at time t = with some given initial values 
ruff , Pq 1 and E we compute the following quantities at each time t, for t = 1, 2, . . . , N: 

(a) the prior mean vector and covariance matrix of ^|E,y* , 

at = Gmf_ x and Rf = GP^G' + Q. 

(b) the mean vector and covariance matrix of the one-step ahead forecast of j/tly 4 " 1 , 

2/^(1) = F'a t and Qf = F'Rf 1 F + E. 

(c) the posterior mean vector and covariance matrix of 9t\y t , 

mf = a t + Afef and P t M = Rf - Af Q f (A?)', 

where Af = Rf 1 F{Qf 1 )~ 1 is the Kalman gain and ef 1 = yt — yf_i(l) is the one-step 
ahead forecast error vector. 

An updated vector 9n is thus obtained, and the filtering part of the algorithm is completed. 
The backwards sampling phase involves sampling from the distribution of 9t\0t+\, E,y* at all 
times t = N— 1, . . . , 1, 0. Each of such vectors is drawn from a multivariate normal N^Qit, Hf), 
where 

h t = mf + P t M G'(Rf +1 )- 1 (e t+1 -a t+1 ) and H t = P t M {I d - G^^GP^), 

with Id being the d x d identity matrix. At each time t, we also compute el = yt — F'Ot- Once 
the backwards sampling phase is completed, we set 

JV 

t=i 

Finally, with n(f being the prior degrees of freedom and being the prior estimate 
of E, we sample from the full conditional density of E|S,y , which is an inverted Wishart 
distribution ZyV p {n^ [ + N + 2p,NT,^ + n^S^f), whose simulation is also standard. This 
concludes an iteration of the Gibbs sampler. 
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